(57) Abstract: The invention provides an impartial forum operable to match investors with a universe of investment service 
providers, and matches both investors and service providers with a universe of financial resources and alternative investments. One 
or more service provider profiles each having at least one credential are compiled. The credentials of each of the service provider 
profiles are verified and service provider profiles having credentials that cannot be verified are identified. One or more verified 
service provider profiles are compiled. At least one insurance policy is obtained insuring that the credentials of each of the verified 
service provider profiles are correct. At least one customer profile is received and at least one verified service provider profile that 
generally corresponds to the customer profile is identified. Resource and alternative investment profiles generally corresponding 
to the service provider and investor profiles can also be identified. A tiered matching system can be used to match user profiles to 
service provider, resource and/or alternative investment profiles. 
Visitor sees the Personal Web Page containing the user's information, including the user's availability. The content shown depends on the visitor identity (group), the presence of the devices, and the user's current policy (location).

Visitor sees the default Personal Web Page containing the user's information, including the user's availability.

Visitor can perform various operations (audio/video calls, instant messages, offline messages, voice mail), depending on the user's availability.



(57) Abstract: A method to allow Internet 
service subscribers to expose a person-specific 
personalization of their "visitor web pages". 
This service allows a subscriber to build 
a dedicated web page that is specifically 
assembled for another person. A visitor, upon 
accessing the web page, has the capability 
to review static and dynamic information 
or to call the subscriber using multimedia 
communications such as Voice-over-IP or send 
the subscriber information. The page may 
contain information that is made available only 
to the subscriber's family, such as the current 
active policy of the subscriber and the phone 
number that is closest to him. Such information 
is made available via the subscriber's policy. 
(54) Systems and methods for secure transaction management and electronic rights protection 



(57) The present invention provides systems and 
methods for secure transaction management and elec- 
tronic rights protection. Electronic appliances such as 
computers equipped in accordance with the present in- 
vention help to ensure that information is accessed and 
used only in authorized ways, and maintain the integrity, 
availability, and/or confidentiality of the information. 
Such electronic appliances provide a distributed virtual 
distribution environment (VDE) that may enforce a se- 
cure chain of handling and control, for example, to con- 
trol and/or meter or otherwise monitor use of electroni- 
cally stored or disseminated information. Such a virtual 
distribution environment may be used to protect rights 
of various participants in electronic commerce and other 
electronic or electronic-facilitated transactions. Distrib- 
uted and other operating systems, environments and ar- 
chitectures, such as, for example, those using tamper- 
resistant hardware-based processors, may establish 
security at each node. These techniques may be used 
to support an all-electronic information distribution, for 
example, utilizing the "electronic highway". 
(57) Abstract: A method of assisting a 
user over a network (120) that enables 
users to evaluate various products and 
services (135) (collectively "products") is 
provided. The products are described in 
one or more dynamically generated data 
tables accessible through the network. In a 
preferred embodiment, a user provides over 
the network filtering decisions that enable the 
system to filter product records to identify a 
subset of relevant products. In addition, the 
user preferably performs graphical pairwise 
comparisons of the characteristics of the 
desired product to indicate the relative 
importance of such characteristics creating 
a unique user profile or a user may choose a 
predefined profile which corresponds to their 
user group. The data generated as a result 
of the pairwise comparisons is converted into 
weights by applying an analytic hierarchy 
process to create an accurate user profile 
which is used to rank a set of alternatives. 
The system applies the profile to perform 
synthesis to rank the products with respect to 
the user's wants and needs. 
(54) Title: ONLINE METHOD AND COMPUTER SYSTEM 



(57) Abstract 

The present invention relates to an integrated 
on-line method and computer system for providing 
an Internet-linked database of client information, for 
publishing portions of this information on individ- 
ual, semi-custom client web sites, for translating pre- 
determined portions of this information into specific 
formats as required by designated recipients, and for 
transmitting the translated information via the Inter- 
net (102) or private communication channels (120) to 
these designated recipients. The web site publication 
(106) is linked to the creation of an electronically 
readable user information database (114) which can 
submit data to third party systems, including where 
the user is a job seeker, job matching systems to cor- 
relate job requirements with job seeker skills. 
WE CLAIM: 

1. A computerized system integrating publication of a user web site 
with the automated submission of information to multiple recipients, said 
system comprising: 

a) a web site publisher interactively designing and publishing a 
web site for said user, said web site displaying profile 
information provided by said user and being individually 
addressable by a uniform resource locator from a remote 
internet terminal; 

b) an electronically readable database; 

c) a parser to direct said profile information into said 
electronically readable database; 

d) a means for said user to designate a recipient of said profile 
information; 

e) a translation engine operating upon said database and 
converting said profile information into output data; wherein 
said output data is formatted and organized according to the 
system requirements of a designated recipient; 

f) a means of electronically transmitting said output data to said 
designated recipient. 

2. A system according to claim 1, wherein said web site publisher 
provides user with web design objects and templates options, whereby 
said user can control the appearance and information content of said 
web site. 

3. A system according to claim 1, wherein said means of 
transmitting is via the internet or a private communication interface. 

4. A system according to claim 1, wherein said user seeks a job and 
said designated recipient seeks to fill a job. 

5. A system according to claim 1, wherein the user is a job seeker 
and the designated recipient is an on-line or employer job board. 

6. A system according to claim 1, wherein said specifically formatted 
output data includes the uniform resource locator address of said web 
site. 

7. A system according to claim 1, wherein said web publisher 
provides said user with template questions specific according to their 
user profile. 

8. A system according to claim 1, wherein the profile information 
provided by said user relates to a particular interest of said user which is 
complementary to the interests of said multiple recipients. 
9. A system according to claim 1, further comprising a user 
workstation providing a user email account to support the 
communications of said user with said system and said recipients. 

10. A system according to claim 9, wherein said user workstation is 
the communications port for the user to provide and update data; and 
create, publish and revise their web sites. 

11. A system according to claim 1, wherein said designated recipient 
is a printer of stationery or business cards. 

12. A system according to claim 4, wherein said web site publisher 
provides templates and graphic objects according to the user profile. 

13. A system according to claim 8, wherein said particular interest is 
professional, commercial, intellectual, legal, medical, educational, or 
personal in nature. 

14. A method for integrated publication of a user web site and 
distributing user information to multiple recipients, said method 
comprising the steps of: 

a) communicating with said user via the internet and thereby 
receiving profile information; 
b) publishing said information on a web site, wherein said web 
site is individually addressable using a uniform resource 
locator from a remote internet terminal; 

c) storing said information in an electronically readable database; 

d) accessing said database and translating said stored 
information into a format according to the system 
requirements of a recipient designated by said user; 

e) transmitting said translated information to said designated 
recipient. 

15. A method according to claim 14, wherein said publishing 
incorporates web design objects and custom audio, video, graphics, and 
career-guided textual data according to information and instructions from 
said user. 

16. An online, computerized system integrating publication of a user 
web site, the provision of an electronically readable database, and the 
automated provision of data to multiple partners, said system 
comprising: 

a) a web-based public request broker for managing 
communications with said user; said broker to determine the 
access privileges and to receive said personal profile 
information and instructions from said user; 
b) a web site publisher providing a plurality of web site graphical 
templates and profile design options for the display of said 
information, wherein said publisher designs and posts said 
web site according to said information and instructions; 

c) a parser directing said information into an electronically 
readable database; 

d) a partner request broker for managing communications and 
data exchange with said partners; said request broker to 
determine the access privileges and file sharing protocols of 
said partners; 

e) an engine to search the contents of said database according 
to information provided by said partners. 

17. A computer-based system according to claim 16, wherein said 
web site publisher integrates a pre-defined set of web design objects 
and custom audio, video, graphics, and textual data in order to create 
semi-custom designed personal resume web sites. 

18. A computer-based system according to claim 16, wherein said 
request broker communicates with said partner over the internet or via a 
private communication interface. 
19. A computer based system according to claim 16, wherein said 
user is a job seeker and said partner is seeking to match a job seeker to 
a job. 

20. A computer based system according to claim 16, wherein said 
web site is individually addressable using a uniform resource locator 
from a remote internet terminal. 

21. A computer based system according to claim 16, wherein said 
designated recipient markets goods or services. 
(54) Title: COLLECTION AND ANALYSIS OF USER PROFILE INFORMATION 
(57) Abstract 

A system is disclosed that facilitates a web-based user network interface is created by obtaining user profile information from a 
database of user profile information. Then, the system gathers behavioral information from the user profile information and statistically 
analyzes the behavioral information to generate graphs indicative of the user's interaction with applications which are presented on a 
display utilizing agent software. Agent software is also utilized to gather user profile information pertaining to application usage and agent 
utilization to determine characteristics of a user for use in tuning a consistent user interface to applications. 
(57) Abstract 

A system is disclosed that facilitates a web-based data model to support user information capture and storage is created by obtaining 
user profile information, grouping the user profile information in a logical manner, associating a unique name with the grouped user profile 
information, and storing the grouped user profile information and correlated name in a database. Access to the profile information is 
restricted and a customized user interface is created for each application based on the current grouped user profile information. 
S5    1413186  (POTENTIAL? OR PROSPECTIV? OR PROFIL?)(7N)(TARGET? OR CONTACT? OR PROFIL?)
S6     362769  REFERRAL? OR RECOMMENDAT? OR SPEAK?(2W)(HIGHLY? OR WELL) OR
               TESTIMONIAL? OR LETTER?(2W)INTRODUCT? OR COMPLIMENT? OR PRAISE?
               OR HIGH()MARK?
S7     770667  MATCH? OR BROKER? OR (BRING? OR BRUNG? OR BROUGHT? OR LINK?
               OR JOIN?)()TOGETHER?
S8    1243347  CONNECT?
S9    8000267  COUPL? OR MATE? OR MATING?
S10     57756  SPECIF?(3N)CONNECT? OR DEGREE?(2W)SEPARAT? OR CONTROL?(3N)INTERACT?
               OR NETWORK?(2N)(CHAIN? OR CONCATENAT?)
S11     79702  (ACCESS? OR CONTACT? OR COMMUNICAT? OR INTERPERSONAL?)(7N)
               (AUTHORIZ? OR AUTHORIS? OR RESTRICT? OR CONFIDENTIAL? OR PRIVAT?
               OR DISCREET? OR PERSONAL? OR P2P OR PERSON(2W)PERSON OR
               PEER(2W)PEER OR (SHARE? OR SHARING)(2W)(OTHER? OR ANOTHER?))
S12    402504  S1:S11(10N)(COMPUTER? OR NETWORK? OR SERVER? OR WORKSTATION?
               OR DESKTOP? OR INTERNET? OR ONLINE? OR SOFTWARE? OR WEBSITE?
               OR WORLD()WIDE()WEB)
S13      2041  S1 AND S2:S5 AND S12
S14       441  S13 AND S7:S9
S15        21  S14 AND (S6 OR S10:S11)
S16      2967  S1:S5(10N)S6:S9 AND S12
S17        60  S16 AND S10:S11
S18       243  S16 AND S6
S19         4  S14 AND S18
S20        79  S15 OR S17 OR S19
S21        51  S20 AND PY<2001
S22        43  RD (unique items)
? show files
File 2:INSPEC 1969-2005/Apr W3
(c) 2005 Institution of Electrical Engineers
File 6:NTIS 1964-2005/Apr W3
(c) 2005 NTIS, Intl Cpyrght All Rights Res
File 8:Ei Compendex(R) 1970-2005/Apr W3
(c) 2005 Elsevier Eng. Info. Inc.
File 34:SciSearch(R) Cited Ref Sci 1990-2005/Apr W3
(c) 2005 Inst for Sci Info
File 35:Dissertation Abs Online 1861-2005/Mar
(c) 2005 ProQuest Info&Learning
File 62:SPIN(R) 1975-2005/Feb W1
(c) 2005 American Institute of Physics
File 65:Inside Conferences 1993-2005/Apr W4
(c) 2005 BLDSC all rts. reserv.
File 94:JICST-EPlus 1985-2005/Mar W2
(c) 2005 Japan Science and Tech Corp(JST)
File 95:TEME-Technology & Management 1989-2005/Mar W3
(c) 2005 FIZ TECHNIK
File 99:Wilson Appl. Sci & Tech Abs 1983-2005/Mar
(c) 2005 The HW Wilson Co.
File 111:TGG Natl. Newspaper Index(SM) 1979-2005/Apr 26
(c) 2005 The Gale Group
File 139:EconLit 1969-2005/Apr
(c) 2005 American Economic Association
File 144:Pascal 1973-2005/Apr W3
(c) 2005 INIST/CNRS
File 256:TecInfoSource 82-2005/Feb
(c) 2005 Info. Sources Inc
File 434:SciSearch(R) Cited Ref Sci 1974-1989/Dec
(c) 1998 Inst for Sci Info



Set Items Description 

S1      77556  SEARCH?(7N)(CHARACTERISTIC? OR CRITER? OR QUERY? OR QUERIE?
               OR REQUIREMENT? OR QUALIFICATION? OR ATTRIBUTE? OR REQUISIT?)
S2     332509  SEARCH?(7N)(PROFILE? OR TARGET? OR CONTACT? OR STENCIL? OR
               TEMPLAT? OR INFO OR INFORMAT? OR DATA)
S3     100524  (PROFIL? OR TARGET? OR CONTACT? OR STENCIL? OR TEMPLAT? OR
               CHARACTERISTIC? OR ATTRIBUT?)(7N)(PERSONALIZ? OR PERSONALIS?
               OR CUSTOMIZ? OR CUSTOMIS? OR INDIVIDUALIZ? OR INDIVIDUALIS?)
S4       2063  (PROFIL? OR TARGET? OR CONTACT? OR STENCIL? OR TEMPLAT? OR
               CHARACTERISTIC? OR ATTRIBUT?)(7N)(CUSTOM OR TAILOR)()(MAKE? OR
               MAKING? OR MADE?)
S5    4374618  (POTENTIAL? OR PROSPECTIV? OR PROFIL?)(7N)(TARGET? OR CONTACT?
               OR PROFIL?)
S6    3158851  REFERRAL? OR RECOMMENDAT? OR SPEAK?(2W)(HIGHLY? OR WELL) OR
               TESTIMONIAL? OR LETTER?(2W)INTRODUCT? OR COMPLIMENT? OR PRAISE?
               OR HIGH()MARK?
S7    7330780  MATCH? OR BROKER? OR (BRING? OR BRUNG? OR BROUGHT? OR LINK?
               OR JOIN?)()TOGETHER?
S8    6910565  CONNECT?
S9   13635172  COUPL? OR MATE? OR MATING?
S10     91592  SPECIF?(3N)CONNECT? OR DEGREE?(2W)SEPARAT? OR CONTROL?(3N)INTERACT?
               OR NETWORK?(2N)(CHAIN? OR CONCATENAT?)
S11   1150098  (ACCESS? OR CONTACT? OR COMMUNICAT? OR INTERPERSONAL?)(7N)
               (AUTHORIZ? OR AUTHORIS? OR RESTRICT? OR CONFIDENTIAL? OR PRIVAT?
               OR DISCREET? OR PERSONAL? OR P2P OR PERSON(2W)PERSON OR
               PEER(2W)PEER OR (SHARE? OR SHARING)(2W)(OTHER? OR ANOTHER?))
S12   3569649  S1:S11(10N)(COMPUTER? OR NETWORK? OR SERVER? OR WORKSTATION?
               OR DESKTOP? OR INTERNET? OR ONLINE? OR SOFTWARE? OR WEBSITE?
               OR WORLD()WIDE()WEB)
S13      6544  S1(10N)S2:S5 AND S12
S14      4900  S13 AND S7:S9
S15      3866  S14 AND (S7 OR S9)
S16        87  S15 AND S1:S5(15N)S10:S11
S17        81  S15 AND S1:S5(15N)S6
S18       161  S16:S17
S19       127  S18 AND PY<2001
S20        63  RD (unique items)



? show files 

File 9:Business & Industry(R) Jul/1994-2005/Apr 26 

(c) 2005 The Gale Group 
File 13:BAMP 2005/Apr W3 

(c) 2005 The Gale Group 
File 15:ABI/Inform(R) 1971-2005/Apr 27 

(c) 2005 ProQuest Info&Learning 
File 16: Gale Group PROMT (R) 1990-2005/Apr 26 

(c) 2005 The Gale Group 
File 20: Dialog Global Reporter 1997-2005/Apr 27 

(c) 2005 The Dialog Corp. 
File 47: Gale Group Magazine DB(TM) 1959-2005/Apr 27 

(c) 2005 The Gale group 
File 75:TGG Management Contents (R) 86-2005/Apr W3 

(c) 2005 The Gale Group 
File 88:Gale Group Business A.R.T.S. 1976-2005/Apr 26 

(c) 2005 The Gale Group 
File 98:General Sci Abs/Full-Text 1984-2004/Dec 

(c) 2005 The HW Wilson Co. 
File 141:Readers Guide 1983-2005/Dec 

(c) 2005 The HW Wilson Co 
File 148:Gale Group Trade & Industry DB 1976-2005/Apr 27 

(c)2005 The Gale Group 
File 160:Gale Group PROMT (R) 1972-1989 



(c) 1999 The Gale Group 
File 239:Mathsci 1940-2005/ Jun 

(c) 2005 American Mathematical Society 
File 267: Finance & Banking Newsletters 2005/Apr 26 

(c) 2005 The Dialog Corp. 
File 268: Banking Info Source 1981-2005/Apr W3 

(c) 2005 ProQuest Info&Learning 
File 275: Gale Group Computer DB(TM) 1983-2005/Apr 27 

(c) 2005 The Gale Group 
File 369:New Scientist 1994-2005/Mar W4 

(c) 2005 Reed Business Information Ltd. 
File 370:Science 1996-1999/ Jul W3 

(c) 1999 AAAS 
File 476: Financial Times Fulltext 1982-2005/Apr 27 

(c) 2005 Financial Times Ltd 
File 484:Periodical Abs Plustext 1986-2005/Apr W4 

(c) 2005 ProQuest 
File 553:Wilson Bus. Abs. FullText 1982-2004/Dec 

(c) 2005 The HW Wilson Co 
File 610: Business Wire 1999-2005/Apr 27 

(c) 2005 Business Wire. 
File 613: PR Newswire 1999-2005/Apr 27 

(c) 2005 PR Newswire Association Inc 
File 621:Gale Group New Prod.Annou. (R) 1985-2005/Apr 27 

(c) 2005 The Gale Group 
File 624 : McGraw-Hill Publications 1985-2005/Apr 27 

(c) 2005 McGraw-Hill Co. Inc 
File 634: San Jose Mercury Jun 1985-2005/Apr 26 

(c) 2005 San Jose Mercury News 
File 635:Business Dateline(R) 1985-2005/Apr 27 

(c) 2005 ProQuest Info&Learning 
File 636:Gale Group Newsletter DB(TM) 1987-2005/Apr 27 

(c) 2005 The Gale Group 
File 647: CMP Computer Fulltext 1988-2005/Apr W2 

(c) 2005 CMP Media, LLC 
File 674: Computer News Fulltext 1989-2005/Apr W3 

(c) 2005 IDG Communications 
File 696: DIALOG Telecom. Newsletters 1995-2005/Apr 26 

(c) 2005 The Dialog Corp. 
File 810:Business Wire 1986-1999/Feb 28 

(c) 1999 Business Wire 
File 813:PR Newswire 1987-1999/Apr 30 

(c) 1999 PR Newswire Association Inc 
ABSTRACT 

Many important and useful applications for software agents 
require multiple agents on a network that communicate with 
each other. Such agents must find each other and perform a 
useful joint computation without having to know about every 
other such agent on the network. This paper describes a 
matchmaker system, designed to find people with similar in- 
terests and introduce them to each other. The matchmaker is 
designed to introduce everyone, unlike conventional Internet 
media which only allow those who take the time to speak in 
public to be known. 

The paper details how the agents that make up the matchmaking 
system can function in a decentralized fashion, yet 
can group themselves into clusters which reflect their users' 
interests; these clusters are then used to make introductions or 
allow users to send messages to others who share their inter- 
ests. The algorithm uses referrals from one agent to another 
in the same fashion that word-of-mouth is used when people 
are looking for an expert. A prototype of the system has been 
implemented, and results of its use are presented. 

KEYWORDS: agents, collaborative filtering, CSCW, joint 
computation, ecology of computation, user modeling, intelli- 
gent systems, information retrieval, distributed AI, Internet. 

INTRODUCTION 

Software agents are computer programs which attempt to per- 
form some set of tasks autonomously for their users, in a 
trustworthy, personalized fashion. They can be either manu- 
ally programmed by the user, or use techniques from machine 
learning to discover how the user does some task and gradu- 
ally automate it. Examples include mail filtering programs, 
which learn or are told whose mail is valued and whose is not 
[9][10]; meeting scheduling programs, which learn or are told 
when and with whom to schedule meetings and how flexible 
to be in negotiating (with other agents) for times depending 
on who else is in the meeting [7]; and so forth. Many software 
agents are even designed to be primarily entertaining, perhaps 
with ancillary practical or informative goals [3][11]. 

Other agents take more initiative; they actively inform the 
user when they find items that match the user's known interests. 
Often, such agents may not understand the domain of interest 
directly, but are instead facilitators that can find other 
people who understand the domain better and who can advise. 
Automated collaborative filtering, in which users with similar 
tastes are matched up, is used in systems such as Webhound [9] 
or HOMR/Ringo [14]. 

While the two systems above match up users' tastes to make 
recommendations, their focus is not explicitly on matchmaking 
users and introducing them to each other. The research 
described in this paper focuses on introducing users who 
are interested in similar topics. There are a number of reasons 
why one might want to do this: 

• People are often working on similar projects without realizing 
it, be it two people down the hall from each other 
reinventing the same wheel, or two doctors both doing re- 
search on similar cases but having no idea that both of 
them are studying the same literature. 

• It is often the case that people need to find an expert in 
some field, but finding such an expert can be difficult and 
time-consuming. Those who are not well "plugged-in" 
via word of mouth can find this even more difficult. 

• There is potential for a great deal of social collaboration 
on the Internet, but it is often underutilized. "Lurkers" 
who read but do not post to mailing lists or newsgroups, 
for example, are an undiscovered resource to the commu- 
nity, invisible because they do not contribute to public 
discussion. 

Current communications systems on the Internet are not well- 
designed for this sort of matchmaking. In almost all media on 
the Internet, only people who take the time to write a piece of 
prose and transmit it somewhere, whether by mail, news, or 
making a Web page, are ever seen by anyone else. Two peo- 
ple who are both working on the same problem, or who share 
an interest, may never know if they themselves are not actually 
writing about it. The matchmaking system described here 
is designed to help these "lurkers," who are not part of the public 
discussion, nonetheless find each other and establish a 
community. 

Why having multi-agent systems helps 

Many currently-implemented agents use a centralized archi- 
tecture, in which one agent serves either one or many users. 
A centralized architecture has its advantages: for example, if 
there is no effective way for peers to find each other, a cen- 
tralized solution may be the only workable solution. Unfortu- 
nately, there are problems with a centralized architecture: 
• Scaling such an architecture to large numbers of users is 
difficult; in systems which must correlate user interests, 
for example [14], straightforward approaches to this prob- 
lem generally require a quadratic-order matching step 
somewhere. 

• If the system requires either high availability (due to con- 
stant demand for its services) or high trustability (because 
it handles potentially sensitive information, such as per- 
sonal data), a centralized server provides a single point 
where either accidental failure or deliberate compromise 
can have catastrophic consequences. 
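
To make the scaling concern concrete, here is a minimal sketch of a naive central matcher. Representing interests as keyword sets and scoring a pair by its shared-keyword count are assumptions made purely for illustration; the function names are hypothetical and do not come from the systems cited above.

```python
from itertools import combinations


def pairwise_comparisons(num_users: int) -> int:
    """Number of distinct user pairs a naive central matcher must compare."""
    return num_users * (num_users - 1) // 2


def naive_match(interests: dict[str, set[str]]) -> list[tuple[str, str, int]]:
    """Compare every pair of users and report how many interests they share."""
    results = []
    for a, b in combinations(sorted(interests), 2):
        results.append((a, b, len(interests[a] & interests[b])))
    return results
```

Under these assumptions, 1,000 users already require 499,500 comparisons, and the count grows with the square of the user population.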

For these reasons and others, many foreseeable future appli- 
cations for software agents involve large numbers of agents 
interacting with each other. Users may have a number of 
agents operating on their behalf, and agents of any particular 
user may have to communicate with other agents elsewhere 
on the network in order to share information. 

Why multi-agent systems are hard to build 

While decentralized, multi-agent systems have several im- 
portant advantages, one of the largest problems with them is 
how agents are supposed to find each other. Each agent 
should not have to know about (and, indeed, probably cannot 
know about) every other agent, user, or resource on the net- 
work. Instead, some mechanism by which agents may locate 
only the useful agents on the network must be arranged. 

There are several relatively straightforward approaches that 
have been used in other networked systems. For example, hi- 
erarchical organization of the entities, as is done with re- 
source records in the Internet domain name system [12] or 
with newsgroup topics in the Usenet [4], can help to reduce 
the inherently quadratic problem to a logarithmic one. 
However, such approaches depend on some inherent organi- 
zational principle that is established in advance, which is nei- 
ther always optimal nor always convenient; for example, con- 
sider the number of crossposted Usenet articles, a clear indi- 
cation that single-inheritance hierarchies are not necessarily a 
good match to the underlying topic space. 

This research focuses on the problems of a matchmaking ser- 
vice, one designed to find groups of people with similar inter- 
ests and bring them together to form coalitions and interest 
groups. We are not explicitly interested here in romantic 
matchmaking between users, for many reasons — the most ob- 
vious being that shared interests do not necessarily mean that 
two people are romantically compatible. The intended scale 
of the matchmaking is that of the entire Internet, an environ- 
ment in which there are potentially millions of users and mil- 
lions of agents corresponding to them. The domain and the 
large number of agents present difficult coordination prob- 
lems, such as: 

• there is no obvious a priori hierarchy by which to organize 
the agents (why would any one person's interests be at the 
top of any hierarchy? how would we know whom to pick, 
anyway?); 

• asking other agents at random resembles diffusion in a 
gas and is extremely slow; it means each agent could be 
required to ask every agent on the network, guaranteeing 
a solution that scales poorly; and 

• a centralized approach runs into the problems mentioned 
above of quadratic scaling, and also is subject to single- 
point-of-failure problems if the central system either fails 
or is compromised — an important point for an application 
handling potentially sensitive data. 

Finding the right cluster of peer agents: the core idea 

To address these problems, this research considers an overall 
organization which borrows ideas from computational ecolo- 
gy [5], in which agents have only local knowledge, but self- 
organize into larger units. The core ideas in the approach tak- 
en here are to 

• compare the agents' information in a peer-to-peer, decen- 
tralized fashion, 

• use referrals from one agent to another and an algorithm 
resembling hill-climbing to find other, more appropriate 
agents when searching for relevant peers, in order to 

• build clusters or clumps of like-minded agents, and to 

• use these clusters of similar or like-minded agents (whose 
users therefore share similar interests) to introduce users 
to each other and enable cluster-wide messaging between 
users whose interests match. 

• use a persistent agent that runs most of the time, for long 
periods; the user does not start up the agent, get an imme- 
diate result, and shut it down, but instead runs it in the 
background for hours or weeks, while it uses "word of 
mouth" to find and join appropriate groups of agents 
whose users share the same interests. 
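
The referral idea in the list above can be sketched as a simple hill-climbing loop. This is an illustrative sketch only, not the Yenta algorithm itself: the Agent class, the use of keyword sets for interests, the Jaccard similarity measure, and the max_hops limit are all assumptions introduced here.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Similarity of two keyword sets (an assumed stand-in for interest comparison)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class Agent:
    """Hypothetical agent holding its user's interests and the peers it has heard of."""

    def __init__(self, name: str, interests: set[str]):
        self.name = name
        self.interests = set(interests)
        self.known: list["Agent"] = []  # local knowledge only

    def refer(self, seeker: "Agent"):
        """Return the known agent most similar to the seeker, or None."""
        candidates = [a for a in self.known if a is not seeker]
        if not candidates:
            return None
        return max(candidates, key=lambda a: jaccard(a.interests, seeker.interests))


def find_peer(seeker: Agent, start: Agent, max_hops: int = 10):
    """Follow referrals while each hop improves similarity (hill-climbing)."""
    current = start
    best = jaccard(seeker.interests, current.interests)
    path = [current.name]
    for _ in range(max_hops):
        referral = current.refer(seeker)
        if referral is None:
            break
        score = jaccard(seeker.interests, referral.interests)
        if score <= best:
            break  # local maximum: no better referral available from here
        current, best = referral, score
        path.append(current.name)
    return current, path
```

Like any hill-climbing search, this can stop at a local maximum; a long-running agent could presumably retry from different starting contacts over time.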

How the resulting clusters can be used 

Once agents have formed clusters — an ongoing and continu- 
ous process for real agents on the Internet, due to the scale 
and constantly-changing environment involved — how can 
we use these clusters? There are many applications; this is a 
short summary: 

• Messaging into the group. A user whose agent is in some 
particular group can send a message into the group — ei- 
ther those other agents known directly by the user's agent 
to be in the same cluster, or transitively through all other 
agents in the cluster by following cluster cache informa- 
tion in a flooding algorithm. Thus, given some particular 
granule on the user's local agent, the user could ask his 
agent to send a message to all other agents in the clump of 
which this granule is a member. 

• Introductions. The chain of referrals themselves can be 
useful information, and can be exposed to the user under 
certain circumstances. Not only can the user send messages 
to particular individuals (whether pseudonymously 
or not), but the agent itself can facilitate a "flirtatious" sort 
of introduction in which information can be symmetrical- 
ly and gradually revealed, via cryptographic protocols. 
Users could ask for an explicit introduction to particular 
members of the cluster, or could instruct their agent to ac- 
cept or solicit introductions when it looked like there was 
a particularly good match available. 
• Finding an expert. By using a combination of messaging 
into the group and introductions, the clusters that a user's 
agent finds itself in can potentially be used to find experts 
on the subject, since presumably such experts (if they, 
too, are running the agent) will have their interests reflect- 
ed in the clustering. Here, a user could prepare a small 
piece of prose, or find some existing message, which talks 
about the subject for which the user wants an expert; the 
clustering algorithm could then generate a granule for this 
grain and attempt to find a suitable cluster. Once found, it 
could start the introduction process to acquaint the ques- 
tioner and the expert. 
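
The flooding delivery mentioned under "Messaging into the group" can be sketched as a breadth-first walk over cluster caches. This is a minimal illustration, not the paper's implementation: the `cluster_cache` dictionary (agent name to known cluster peers) and the agent names are hypothetical.

```python
from collections import deque

def flood(cluster_cache, origin):
    """Return the agents reached by flooding from origin, in breadth-first order.

    cluster_cache maps each agent name to the peers it knows to be in the
    same cluster (a hypothetical, simplified stand-in for per-agent caches).
    """
    seen = {origin}
    order = []
    queue = deque([origin])
    while queue:
        agent = queue.popleft()
        for peer in cluster_cache.get(agent, ()):
            if peer not in seen:      # deliver each message only once
                seen.add(peer)
                order.append(peer)
                queue.append(peer)
    return order

# A knows only B directly, but the message reaches C and D transitively.
caches = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"], "E": []}
reached = flood(caches, "A")
print(reached)
```

Agent E is never reached because no cluster cache mentions it.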

What is described in this paper 

The following sections describe the algorithm used in a pro- 
totype of the clustering system, the testing used to evaluate its 
performance, and how this work is integrated into the larger 
goal of automatically building interest groups and coalitions 
on the Internet. 

Note that the algorithms described below are but a small 
piece of the overall task. In particular, since the system han- 
dles sensitive information such as people's interests, fielding 
the system on the Internet requires cryptographic privacy 
safeguards briefly described elsewhere [1][2] and which are 
the subject of current research. Furthermore, as an initial pro- 
totype for testing the efficacy of clustering, no user interface 
is described. The entire system, including such cryptographic 
safeguards, a user interface, and other necessary elements, is 
called Yenta; to avoid confusion, the prototype piece de- 
scribed here is called Yenta-Lite or YL for short. 

THE APPROACH 

The overall goal is to form clusters of agents whose users 
share similar interests. In order to do this, we must answer the 
following questions: 

• What does it mean to have an interest, and how do agents 
know about these interests? 

• How do we determine similarity of interests? 

• How does a particular agent know which other agents to 
contact? 

• How can we form clusters of similar agents? 

What does it mean for a user to have an interest, and 
how do we capture that computationally? 

For the purposes of matching people by their interests, we as- 
sume that these interests are capturable in some computer- 
based form. At the moment, Yenta only deals with text, such 
as electronic mail messages, the contents of various news- 
group articles, the contents of the user's files in a filesystem, 
and so forth. The architecture of Yenta supports somewhat 
different sources of information as well (such as World Wide 
Web hotlists and homepages) — the crucial requirements for 
any interest are a) they are represented in some electronic 
form, hence captured by the computer, and b) there is some 
way of comparing two potential interests and assigning a de- 
gree of similarity between them. 

As currently implemented, Yenta-Lite can examine the con- 
tents of email messages, newsgroup articles, and user files 



that the user has received, read, or written. The tests described 
in this paper used newsgroup articles and email messages 
only, as discussed in the section on evaluating Yenta-Lite's
performance. Each individual message, article, or file being
compared is considered a document; however, since Yenta
might eventually be comparing nontextual documents, we
use the term grain to refer to any individual chunk of bits
associated with a user.

A user is deemed to have an interest if several grains are
similar to each other. Such a collection of similar grains is
called a granule. A user may own many granules, each
corresponding to some separate interest; for example, a user
who regularly reads newsgroup articles on dogs and cars would
presumably have two granules reflecting these disparate
interests.

Two users, A and B, are deemed to share an interest if A has
at least one granule that is similar to at least one of B's
granules. Two or more users who share an interest are
conceptually in a cluster at the instant that they both possess
similar granules; they are actually in a clump at the instant
their two agents discover this similarity. A diagram
illustrating this is below.



[Figure: Agent #1 and Agent #2, joined in a cluster]



Suppose we have three users, A, B, and C. Suppose that A and
B are in a clump, and B and C are in a clump. Are A, B, and
C all in a clump together? Not necessarily. If A is interested
in dogs and cars, his associated granules are $A_{dogs}$ and
$A_{cars}$. If the other granules are $B_{dogs}$, $B_{zebras}$,
$C_{dogs}$, and $C_{guitars}$, then A, B, and C are all in a
clump, because they all share an interest in dogs. However, if
$C_{dogs}$ was instead $C_{zebras}$, then we have two clumps,
one reflecting A and B's interests in dogs, and one reflecting
B and C's interest in zebras. B in this case is in two clumps,
while A and C are each in one clump.
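
The dogs/zebras example can be checked with a few lines of code. This sketch assumes, purely for illustration, that two granules are similar exactly when they carry the same topic label; real granules would be compared with a similarity measure instead. The function name `clumps` is hypothetical.

```python
from collections import defaultdict

def clumps(granules):
    """Group users by shared granule topic; a topic held by one user forms no clump."""
    by_topic = defaultdict(set)
    for user, topics in granules.items():
        for topic in topics:
            by_topic[topic].add(user)
    return {t: users for t, users in by_topic.items() if len(users) > 1}

# First case: all three users share an interest in dogs.
case1 = clumps({"A": {"dogs", "cars"},
                "B": {"dogs", "zebras"},
                "C": {"dogs", "guitars"}})

# Second case: C's dogs granule is a zebras granule instead.
case2 = clumps({"A": {"dogs", "cars"},
                "B": {"dogs", "zebras"},
                "C": {"zebras", "guitars"}})

print(case1)
print(case2)
```

In the second case B appears in two clumps, while A and C each appear in one.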

How do we determine similarity of interests? 

The fundamental assumption behind Yenta's assessment of the
similarity of user interests is this: If two users both have
several documents which are similar to each other, then the
users are assumed to share an interest themselves.

In order to function at all, Yenta demands that any two grains 
can be compared to yield some measure of similarity. It is 
also required that this measure be (at least) partially-ordered; 
a floating-point number, for example, which reflects how 
similar two grains are is an acceptable representation. The 
Yenta architecture allows more sophisticated similarities than 
scalar numbers, but Yenta-Lite, and the results reported here, 
use only scalars. At the moment, it is also assumed that this
comparison operator is reflexive (that is, symmetric): if A's
similarity to B is 0.74, then B's similarity to A is likewise
0.74. Future work may explore the stability of the clustering
algorithm in the face of nonreflexive comparison operators.

Since Yenta-Lite's grains are all exclusively textual, we use
the SMART [15] document system to compare them. 
SMART is designed primarily to index and retrieve docu- 
ments from large collections. It has many possible modes of 
operation; in our use, SMART first stems all words in any 
given document (e.g., removes prefixes and suffixes and oth- 
erwise canonicalizes the text), computes an inverse-frequen- 
cy metric for each word in the document (so that rare words 
with greater power to discriminate two documents from each 
other have greater weight than common words which appear 
in most documents), and computes a vector which describes 
the document based on these. 

When used to index into a large collection of documents, 
SMART normally takes a query, computes the vector associ- 
ated with the query, and dots the resulting query vector with 
the vectors corresponding to each document. Dot products 
which have high scores are reported. In Yenta's case, the que- 
ry is itself a document; therefore, Yenta essentially takes 
pairs of documents, dots them together, and assumes that high 
scores indicate similarity. 
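
The document-to-document comparison can be illustrated with a small vector-space sketch. This is not SMART itself: stemming is omitted, and the particular weighting (term count times log inverse document frequency, normalized to unit length) is an assumption chosen for brevity. The function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build one sparse, unit-length TF-IDF vector (a dict) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for words in tokenized for word in set(words))
    vectors = []
    for words in tokenized:
        tf = Counter(words)
        # Rare words get more weight; a word present in every document gets 0.
        vec = {w: tf[w] * math.log(n / df[w]) for w in tf}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({w: v / norm for w, v in vec.items()} if norm else {})
    return vectors

def dot(u, v):
    """Dot product of two sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(x * v.get(w, 0.0) for w, x in u.items())

docs = ["my dog chases the ball",
        "the dog fetches the ball",
        "the car needs new tires"]
vecs = tfidf_vectors(docs)
print(dot(vecs[0], vecs[1]), dot(vecs[0], vecs[2]))
```

The two dog documents score above zero, while the dog and car documents share only the word "the", which carries no weight because it occurs in every document.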

This is not the only way to do this, of course. For example, 
consider WordNet [13], which is a semantic net that allows 
comparing words based on how many links away one word is 
from another, and in what direction (e.g., synonym, antonym, 
superset, etc). Future implementations of Yenta may combine 
SMART and WordNet if the advantages (e.g., possibly more 
resilience in the face of synonyms that rarely co-occur in a 
single document) outweigh the disadvantages (e.g., greater 
semantic "fuzz" in the comparison due to the greater number 
of words investigated in any given document). 

Forming clusters via referrals 

We now come to the heart of the clustering algorithm. Given 
that we have a multiplicity of agents with no central node and 
no hierarchy, how can we reasonably form clusters which re- 
flect the interests of the users? 

The major steps (described in more detail) are: 



• Intra-agent initialization, known as preclustering: Com- 
bine grains into granules within a single agent. 

• Inter-agent initialization, known as bootstrapping: Find at 
least one other agent with which to communicate. 

• Walk referrals and cluster: Form clusters of like-minded 
agents. 

Preclustering 

When an agent first starts running, it must determine what in- 
terests its user possesses. It does this by collecting some sub- 
set of the user's email, newsgroup articles, and files; each 
such item is known as a grain. Each separate grain is consid- 
ered for membership in a growing collection of granules. 

First, each grain is converted into a SMART vector. Next, a
complete cross-product table is created in which each grain's
SMART vector is dotted with each other grain's SMART
vector; each resulting dot-product $p$ is an entry in the table.
This is an $O(n^2)$ operation, given that there are $n$ grains in
the user's collection. The result is a table in upper-triangular
form, with the main diagonal suppressed (since the main
diagonal corresponds to comparing each grain to itself). We
then compute the mean, $\bar{p}$, and the standard deviation,
$\sigma$, of all of the nonzero entries in this table of $p$
values. Typically, 60% of the entries in the table are zero.

Next, a grain is picked at random to start the process of
preclustering into granules. It is assigned to the first granule,
$G_0$. To grow $G_0$, we pick a grain $g$ not already in $G_0$ and
compare it, by dotting SMART vectors together, to each grain
already in $G_0$; we compute the mean $\bar{g}$ of these dot
products. We repeat this process for all the other grains not in
$G_0$ and remember $\bar{g}_{best}$, which is the best mean.
Then, we see if

$$\bar{g}_{best} > \bar{p} + W\sigma$$

is true, where $W$ is a weight described in the next paragraph.
If the relation is true, then the grain corresponding to
$\bar{g}_{best}$ is added to $G_0$. When we have made a complete
pass through all documents not in $G_0$, we take a document at
random in the leftovers and start trying to make granule $G_1$.
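
The granule-growing loop can be sketched as follows. This is one reading of the description, not the prototype's code: it assumes the best remaining grain is added repeatedly until the test fails, uses the population standard deviation, and takes a caller-supplied symmetric `sim(i, j)` in place of SMART dot products. All names are hypothetical.

```python
import random
import statistics

def precluster(sim, n, weight=1.0, rng=None):
    """Greedily partition grains 0..n-1 into granules.

    sim(i, j) returns the symmetric similarity of grains i and j.
    """
    rng = rng or random.Random(0)
    # Upper-triangular table, main diagonal suppressed.
    pairs = [sim(i, j) for i in range(n) for j in range(i + 1, n)]
    nonzero = [s for s in pairs if s != 0]
    threshold = statistics.mean(nonzero) + weight * statistics.pstdev(nonzero)

    leftovers = list(range(n))
    granules = []
    while leftovers:
        # Seed a new granule with a randomly chosen leftover grain.
        granule = [leftovers.pop(rng.randrange(len(leftovers)))]
        while leftovers:
            def mean_to_granule(g):
                return statistics.mean(sim(g, m) for m in granule)
            best = max(leftovers, key=mean_to_granule)
            if mean_to_granule(best) <= threshold:
                break                 # best candidate fails the test
            granule.append(best)
            leftovers.remove(best)
        granules.append(sorted(granule))
    return sorted(granules)

# Six grains on two topics: same-topic pairs score 0.9, others 0.1.
topics = [0, 0, 0, 1, 1, 1]
result = precluster(lambda i, j: 0.9 if topics[i] == topics[j] else 0.1, 6)
print(result)
```

With these toy scores the threshold falls between 0.1 and 0.9, so the grains separate by topic regardless of which grain seeds each granule.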

The weight $W$ is essentially a user-tunable variable. $W = 1$
implies that roughly 17% of the grains will pass this test when
compared to a randomly selected granule, since a weight of 1
corresponds to everything on the high side of one standard
deviation from the mean; that is:

$$\frac{100\% - 67\%}{2} \approx 17\%$$

This process of producing granules is relatively
time-consuming (it has several $O(n^2)$ steps in it), but must be
done only once for any given collection of the user's grains,
and, as shown later, it appears to produce acceptable results.

In true Yenta, it is assumed that the user will constantly be 
adding grains to his collection as new messages come in or 
new files are created; however, incrementalizing the algo- 
rithm to cope with each added grain is relatively easy: we 
compare each new grain with existing granules for member- 
ship, adding it if it matches well. Otherwise, it is put aside 
with the rest of the unmatched grains, and this pile of un- 
matched grains is occasionally scanned to see if a large 
enough number of grains are similar that they can form a new
granule. 
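
The incremental step can be sketched as a single function. The use of mean similarity to a granule's members as the membership test, and the names `add_grain`, `unmatched`, and `threshold`, are assumptions made for illustration.

```python
def add_grain(grain, granules, unmatched, sim, threshold):
    """Place a new grain in its best-matching granule, or set it aside.

    granules is a list of non-empty lists of grains. Returns True if the
    grain joined an existing granule.
    """
    best, best_mean = None, threshold
    for granule in granules:
        mean = sum(sim(grain, member) for member in granule) / len(granule)
        if mean > best_mean:
            best, best_mean = granule, mean
    if best is None:
        unmatched.append(grain)       # rescanned later for new granules
        return False
    best.append(grain)
    return True

# Toy similarity: grains match when their names share a first letter.
same_topic = lambda a, b: 1.0 if a[0] == b[0] else 0.0
granules = [["d1", "d2"], ["c1"]]
unmatched = []
joined_dog = add_grain("d3", granules, unmatched, same_topic, 0.5)
joined_zebra = add_grain("z1", granules, unmatched, same_topic, 0.5)
print(granules, unmatched)
```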

Bootstrapping

The next phase requires finding at least one other agent with 
which to communicate; finding more after that is easier — due 
to other agents' rumor caches — in that it is less likely that we 
will require either ad-hoc heuristics or user intervention. In 
Yenta-Lite, we finesse this problem and assume that we can
always find another agent. Several heuristics are available for
true Yenta, including broadcasts and directed multicasts on 
local network segments to find other agents in the same orga- 
nization, asking a central registry which contains a partial list 
of other known agents, and asking the user for suggestions. 
All of these heuristics have various advantages and disadvan- 
tages, but we shall not pursue them here. 

Data structures used in finding referrals and clusters 

We now come to the step in which the various granules in
agents form clusters with other granules. For concreteness,
assume that we have two agents, named A and B, which each
have a few granules in them, e.g., $G_{A0}$, $G_{A1}$, etc. Each
agent also contains several other data structures:

• A cluster cache, CC, which contains the names of all other
agents currently known by some particular agent as being in
the same cluster. Thus, if agent A knows that its granule 1
is similar to granule 3 of agent B, then $CC_A$ contains a
notation linking $G_{A1}$ to $G_{B3}$. There are two important
limits to the storage consumed by such caches: $g_l$ ("local
granules"), the number of separate granules that any given
agent is willing to remember about itself; and $g_r$ ("remote
granules"), the number of granules this agent is willing to
remember about other agents. The total size of CC is hence
$g_l$ times $g_r$. In Yenta-Lite, these are essentially
unbounded; in an implementation that wishes to save space,
limiting $g_r$ before limiting $g_l$ would seem to make the
most sense, as this limits the total number of other agents
that will be remembered by the local agent, while not
limiting the total number of disparate interests belonging
to the user that may be remembered by the local agent.

• A rumor cache, RC, which contains the names and other 
information (described below) from the last r agents that 
this agent has communicated with. In Yenta-Lite, r is ar- 
bitrarily set to 5, and it should definitely be bounded in 
true Yenta as well, since otherwise any given agent will 
remember all of the agents it has ever encountered on the 
net and its storage consumption will grow without bound. 
Reasonable values for bounds in real-life operation with 
large numbers of agents are currently unknown, but are 
suspected to be on the order of 20 to 100. 

• A pending-contact list, PC, which is a priority-ordered
list of other agents that have been discovered but which 
the local agent has not yet contacted. 
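
The three per-agent structures can be sketched as a small class. The field names, the tuple layout of cluster-cache entries, and the use of a numeric priority for the pending-contact list are assumptions for illustration; only the bound r = 5 on the rumor cache comes from the text.

```python
import heapq
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    rumor_limit: int = 5                                # r in the text
    # Cluster cache entries: (local granule, remote agent, remote granule).
    cluster_cache: set = field(default_factory=set)
    rumor_cache: deque = field(init=False, default=None)
    pending: list = field(default_factory=list)         # heap of (-priority, name)

    def __post_init__(self):
        # A bounded deque silently forgets the oldest of the last r agents.
        self.rumor_cache = deque(maxlen=self.rumor_limit)

    def remember(self, agent_name, granules):
        self.rumor_cache.append((agent_name, granules))

    def enqueue(self, agent_name, priority):
        heapq.heappush(self.pending, (-priority, agent_name))

    def next_contact(self):
        """Pop the highest-priority agent not yet contacted, if any."""
        return heapq.heappop(self.pending)[1] if self.pending else None

a = Agent("A")
for i in range(7):
    a.remember(f"agent{i}", [])
a.enqueue("X", 0.2)
a.enqueue("Y", 0.9)
print(len(a.rumor_cache), a.rumor_cache[0][0], a.next_contact())
```

After seven contacts only the five most recent remain in the rumor cache, so storage stays bounded as the text requires.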

The rumor cache contains more than just the names of other 
agents encountered on the network. It also contains some sub- 
set, perhaps complete, of the text of each granule correspond- 
ing to those agents. 



The stored granules themselves are essential for the referral 
process. Having the complete text of each granule, or even 
most of it, could represent a large amount of storage (e.g., 
100K or more per granule, depending on exactly what is in 
any given granule). We do not just store the SMART vectors 
because: 

• The Yenta architecture does not require that the compar- 
ison operator be able to handle a "reduced information" 
representation of the two grains to be compared. SMART 
happens to compare two documents by reducing them to 
a pair of vectors before dotting them together, but other 
comparison operators might not produce such a compact 
representation as part of their operation. 

• The Yenta architecture does not enforce a requirement 
that each agent be running identical software, and indeed 
expects that any given pair of agents may be running 
slightly different versions, including different comparison 
functions. There is no telling a priori whether some re- 
duced-information representation of a particular grain 
will be correct for two different comparison operators. 

Note that one might allow the user to choose a reduced- 
information version of each granule, accepting the re- 
duced performance that would result when other agents 
give up on interoperating with the local agent when they 
discover that their comparison operators differ. 

• Having the complete text of each granule represents more 
than a space penalty — it also represents a serious privacy 
problem if some particular agent were to be maliciously 
modified to disgorge both the contents and identity of 
some remote agent. In true Yenta (but not Yenta-Lite), 
this is ameliorated using cryptographic protocols to hide 
information, even in the cache, and also to hide identities 
of the remote agents. 

Getting referrals and doing clustering 

Now that we have all this mechanism in place, performing re- 
ferrals and clustering is relatively uncomplicated. 

The process starts when some agent (call it A) has finished
preclustering and has found at least one other agent (call it B)
via bootstrapping. Agent A then performs a comparison of its
local granules with those of agent B, using a process
reminiscent of the preclustering phase but simplified. A builds
an upper-triangular matrix describing the similarities between
each of its local granules and those locally held by B. Then,
rather than taking averages and standard deviations, it simply
finds the highest score (i.e., closest similarity) between any
given granule (say, $G_{A1}$) and B's granules. If there is no
such value above a particular threshold, then the local granule
under consideration does not match any of B's granules,
although some other local granule, e.g., $G_{A2}$, might match.

The comparison process is simpler in the clustering (inter- 
agent) phase than in the preclustering (intra-agent) phase in 
part because two agents talking to each other cannot assume 
that they have complete information about either each other 
or the space of all possible other granules on the network. 
Thus, we do not bother trying to calculate averages and stan- 
dard deviations; as observed in the prototype, a simpler, 
threshold-based match appears to work well enough.
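
The threshold-based inter-agent match can be sketched as below. Granules are modeled as word sets and compared with Jaccard overlap, which is a stand-in chosen for brevity and not the SMART comparison; the granule names and the threshold value are hypothetical.

```python
def match_granules(local, remote, sim, threshold):
    """For each local granule, return the name of its best remote match.

    Only matches scoring above threshold are kept; other local granules
    are simply absent from the result.
    """
    matches = {}
    for l_name, l_gran in local.items():
        best = max(remote.items(), key=lambda kv: sim(l_gran, kv[1]), default=None)
        if best is not None and sim(l_gran, best[1]) > threshold:
            matches[l_name] = best[0]
    return matches

def jaccard(a, b):
    """Overlap of two word sets, between 0 and 1."""
    return len(a & b) / len(a | b) if a | b else 0.0

local = {"A1": {"dog", "leash"}, "A2": {"car", "tire"}}
remote = {"B1": {"zebra", "stripe"}, "B2": {"dog", "leash", "bone"}}
matches = match_granules(local, remote, jaccard, threshold=0.5)
print(matches)
```

Granule A1 matches B2, while A2 matches nothing held by B.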

When we are done comparing granules from A with granules 
from B, agent A may have found some acceptably close 
matches. Such matches are entered, one pair of granules at a 
time, in A's cluster cache. B is likewise doing a comparison 
of its granules with A and is entering items in its own cluster 
cache. 

Whether or not any matches were found that were good 
enough to justify entering them in a cluster cache, the next 
step is to acquire referrals to agents that might be better 
matches. In the example here, agent A asks agent B for the en- 
tire contents of its rumor cache, and runs the same sort of 
comparison on those contents that it did on agent B's own lo- 
cal granules. Good matches are added to A's cluster cache, 
the rest of the data is added to A's rumor cache, and A's 
namelist is updated by adding to it those other agents which 
showed good matches to A, that is, those agents which had 
granules that went into A's cluster cache. These agents will 
be contacted next, after A finishes with B and any other en- 
tries in its namelist. The various caches belonging to B that A 
has been consulting were gathered by B in a similar way; ev- 
ery agent participating in this protocol is thus building up a 
collection of data for its own use and for the use of other 
agents. 
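
One referral exchange can be sketched as follows. Agents are plain dictionaries here, granules are word sets compared with Jaccard overlap, and the bookkeeping (for example, skipping rumors about oneself) is an illustrative guess at details the text leaves open.

```python
def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def best_match(l_gran, granules, sim, threshold):
    """Name of the remote granule most similar to l_gran, if above threshold."""
    scored = [(sim(l_gran, g), name) for name, g in granules.items()]
    score, name = max(scored, default=(0.0, None))
    return name if score > threshold else None

def referral_step(a, b, sim, threshold):
    """Agent a consults b: first b's own granules, then b's rumor cache."""
    sources = [(b["name"], b["granules"])] + list(b["rumors"])
    additions = 0
    for owner, granules in sources:
        if owner == a["name"]:
            continue                              # skip rumors about ourselves
        hit = False
        for l_name, l_gran in a["granules"].items():
            r_name = best_match(l_gran, granules, sim, threshold)
            if r_name is not None:
                hit = True
                entry = (l_name, owner, r_name)
                if entry not in a["cluster"]:
                    a["cluster"].add(entry)
                    additions += 1
        if hit and owner != b["name"] and owner not in a["namelist"]:
            a["namelist"].append(owner)           # contact this agent later
        elif not hit and (owner, granules) not in a["rumors"]:
            a["rumors"].append((owner, granules))
    return additions

sally = {"name": "Sally", "granules": {"S1": {"stereo", "amp"}},
         "cluster": set(), "rumors": [], "namelist": []}
joe = {"name": "Joe", "granules": {"J1": {"garden", "soil"}},
       "cluster": set(), "namelist": [],
       "rumors": [("Alyson", {"Y1": {"stereo", "amp", "speaker"}})]}
added = referral_step(sally, joe, jaccard, 0.5)
print(added, sally["cluster"], sally["namelist"])
```

Joe himself is no match, but his rumor cache leads Sally to Alyson, who goes into Sally's cluster cache and namelist.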

This procedure acts somewhat like human word of mouth. If 
Sally asks Joe, "What should I look for in a new stereo?" Joe 
may respond, "I have no idea, but Alyson was talking to me 
recently about stereos and may know better." In effect, this 
has put Alyson into Sally's rumor cache (and, if Joe could 
quote something Alyson said that Sally found appropriate, 
perhaps into Sally's cluster cache as well). Sally now repeats 
the process with Alyson, essentially hill-climbing her way to- 
wards someone with the expertise to answer her question. 

EXPERIMENTAL EVALUATION OF THE ALGORITHM 

To test the algorithm presented above, the Yenta-Lite
prototype was implemented. This prototype simulates 20
agents by running them all on a single machine.

A randomly-chosen mix of newsgroups and mailing list
archives, comprising 13 megabytes total from 7 sources, was
used as the grain data for the agents. In particular, the sources
were comp.ai.philosophy, rec.pyrotechnics, and sci.math
(Usenet newsgroups), and alive-archive, macmoose-archive,
physics-archive, and subgenius-archive (two mailing lists
about programming projects, an announcement list for events
of interest to physicists, and an aggressively eclectic mailing
list for members of the Church of the SubGenius).

Each of the 7 sources was subdivided into smaller files, each
no more than 150-200K, yielding 64 smaller files total. Thus,
comp.ai.philosophy was divided into 20 small files,
alive-archive into 3, and so forth. These smaller files were then
randomly distributed amongst the 20 agents, such that each
agent received either 3 or 4 of them.
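
One way to obtain such a distribution is to shuffle the files and deal them out round-robin. The text only says the distribution was random with 3 or 4 files per agent; the dealing scheme and the name `distribute` are assumptions.

```python
import random

def distribute(files, n_agents, rng):
    """Shuffle files and deal them round-robin so agent loads differ by at most one."""
    shuffled = list(files)
    rng.shuffle(shuffled)
    return [shuffled[i::n_agents] for i in range(n_agents)]

shares = distribute(range(64), 20, random.Random(0))
sizes = sorted(len(s) for s in shares)
print(sizes)
```

With 64 files and 20 agents, four agents receive 4 files and sixteen receive 3.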

Preclustering was run for each of the 20 agents, and the re- 
sulting clusters were hand-analyzed to get an idea of what the 
results were. While preclustering, a grain was deemed inter- 
esting enough to create a granule if at least 5% of the other 



grains available in the given agent also participated in the 
granule. Thus, grains which formed granules consisting only 
of themselves or a tiny number of other grains were inhibited. 

By way of illustration, consider the two example agents be- 
low, which were selected randomly from the 20 total. Agent 
1 got two small files from comp.ai.philosophy (the first and 
second of them, ai.1 and ai.2) and one from sci.math;
SMART converted the resulting grains into 180 vectors. Pre- 
clustering yielded 12937 nonzero matrix entries, which were 
39% of the total entries, and formed 5 different granules 
(named 1.1 through 1.5). Human analysis of the resulting 
granules indicated that there was some overlap between the 
subject areas of the two newsgroups (two granules contained 
messages from both newsgroups, for example). Agent 6, on 
the other hand, completely partitioned the SubGenius mailing 
list from the physics mailing list, and further segmented each 
of those into two different subject areas. 

Agent 1: ai.1, ai.2, sm.1
180 vectors, $\bar{p}$=.072, $\sigma$=.085, NZ=12937 (39%)

1.1 ai/sm. Limits of computing power/theoret. comput.
1.2 ai. Long discourses on fuzzy logic/psychology.
1.3 ai/sm. Philosophy of N, Z, Q, and R construction.
1.4 ai. Books about small towns.
1.5 sm. Division by zero tricks.

Agent 6: sg.1, ph.1, ph.2
68 vectors, $\bar{p}$=.112, $\sigma$=.128, NZ=2131 (46%)

6.1 sg. SubGenius random flaming.
6.2 sg. More SubGenius random stuff; new topic.
6.3 ph. Drivel from the American Physical Society.
6.4 ph. Boston area physics calendar; bad physics poetry.

The distribution of similarity scores was somewhat surprising;
instead of an expected Gaussian, the curve looked more
like a blackbody curve. For example, a randomly-selected
result from comparing one particular grain to a set of others,
while trying to decide whether to place it into a granule,
yielded a curve with a mean $\bar{p}$=.097, $\sigma$=.096, and
the shape below.

Once preclustering was completed, the agents were run in 
random order and allowed to exchange messages. The simu- 
lation was run "to convergence," meaning that agents were 
allowed to continue exchanging messages until no additions 
were made to any agent's cluster cache for hundreds of ex- 
changes — and hence all clusters that were going to form did 
form. This is not the situation that would obtain with true 
Yenta on the Internet, both because the sheer number of 
agents would require a very large number of message ex- 
changes, and because the grains and granules making up each 
individual agent would be under constant change as their us- 
ers received or sent additional messages — hence the system 
could never converge. 
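
The "run to convergence" stopping rule can be sketched as a loop that ends after a run of exchanges with no cluster-cache additions. The `patience` default of 200 stands in for the text's "hundreds of exchanges", and `exchange` is a hypothetical callback returning the number of additions made by one message exchange.

```python
def run_to_convergence(exchange, patience=200, max_exchanges=100_000):
    """Call exchange() until it reports no additions for `patience` calls in a row.

    Returns the total number of exchanges performed.
    """
    quiet = total = 0
    while quiet < patience and total < max_exchanges:
        total += 1
        quiet = 0 if exchange() > 0 else quiet + 1
    return total

# Toy run: additions occur on exchanges 1 and 3, then never again.
additions = iter([3, 0, 2])
total = run_to_convergence(lambda: next(additions, 0), patience=3)
print(total)
```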

A plot of the number of cluster-cache additions made by the
Yenta-Lite running at any particular instant vs. the message
exchange number in the entire simulation appears below.

Convergence was achieved before 800 messages were ex- 
changed between the agents. There was an initial burst in 
which several agents added a large number of granules to 
their cluster caches, followed by a relative lull, followed by a
gradual rise and fall in cluster-cache additions. It is not entire- 
ly clear what accounts for the lull around the 100th message 
exchange; it is possible that all the "easy" clustering hap- 
pened early and each agent then had to build up enough of a 
rumor cache and do sufficient hill-climbing using it for 
progress to continue. 

Since this is a static simulation, it is possible to ask how the
number of messages exchanged during the clustering phase
compares to a brute-force solution, in which each agent's
granules are methodically compared to every other granule in
every other agent. Since the 64 original files turned into 68
total granules, such a crossbar would require $68^2 = 4624$
comparisons if done naively, and $68 \cdot 67 / 2 = 2278$
comparisons if one realizes that the upper triangular part of
the crossbar matrix, minus the main diagonal, is all that need
be computed given a reflexive comparison function. On
average, each message exchange by each Yenta-Lite compared
3.4 granules (68/20) at each end of the exchange, so the
approximately 750 message exchanges performed 2550
comparisons. This is not much more work than the brute-force
solution would have taken, yet it possesses desirable
properties that the brute-force crossbar would not:

• The clusters are grown incrementally for each agent, so at 
any given time, each agent sees at least some of many 
clusters. 

• No agent need retain knowledge of all other granules in 
the system at any time. 

• If an agent were to disappear from the system, the only
lasting effect would be for other agents to "forget" it; the rest
of the clusters would still form.
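
The comparison counts behind this cost estimate can be reproduced with a few lines of arithmetic. The variable names are illustrative; the inputs (68 granules, 20 agents, about 750 exchanges) are the figures given in the text.

```python
n = 68                            # total granules across all agents
agents = 20
exchanges = 750                   # approximate exchanges before convergence

naive = n * n                     # every ordered pair, self-comparisons included
triangular = n * (n - 1) // 2     # unordered pairs, main diagonal excluded
per_exchange = n / agents         # average granules held by one agent
referral = round(exchanges * per_exchange)

print(naive, triangular, referral)
```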

Manual inspection of the clusters that resulted from this run
shows that the clusters produced by the brute-force crossbar
solution and by the referral solution are essentially identical.

RELATED WORK 

There are many efforts in distributed AI and multi-agent sys- 
tems which could be considered relevant; here we consider 
only other matchmaking systems and related approaches. 

A common technique in systems that support computation
amongst a group of users is to centralize a server and have its
users act like clients. Systems that match user interests to
each other, and have such a centralized structure, include
Webhound [14] and HOMR/Ringo [9].

Kuokka and Harada [8] describe a system that matches adver- 
tisements and requests from users and hence serves as a bro- 
kering service. Their system certainly is a matchmaker, but it 
assumes a centralized matchmaker and a highly-structured 
representation of user interests. 

Others have taken a more distributed approach. For example, 
Kautz, Milewski, and Selman [6] report work on a prototype 
system for expertise location in a large company. Their pro- 
totype assumes that users can identify who else might be a 
suitable contact, and use agents to automate the referral- 
chaining process; they include simulated results showing 
how the length and accuracy of the resulting referral chains 
are affected by the number of simulated users and the accura- 
cy and helpfulness of their recommendations. Yenta-Lite dif- 
fers from this approach in using ubiquitous user data to infer 
interests, rather than explicitly asking about expertise. 

CONCLUSIONS AND FUTURE WORK 

Yenta-Lite demonstrates that referral-based matchmaking 
can provide acceptable results without requiring any one 
agent to know about all other agents, and without requiring 
unreasonable messaging traffic or local computation. 

Work is currently proceeding on several aspects of the final 
Yenta design: 

• Implementing the requisite privacy safeguards and user 
interface to permit a networked implementation with real 
user data. 

• Evaluating the suitability and stability of the clustering
algorithms in the face of hundreds or thousands of
instantiations of the agent in a real environment.

• Experimenting with different comparison metrics to en- 
hance Yenta's ability to accurately determine a match in 
user interests. 

Set  Items    Description
S1   267      AU=(WORK J? OR WORK, J?)
S2   123      (JAMES OR JIM OR JIMMY)(2W)WORK
S3   0        WWW()LINKEDIN()COM OR WWWLINKEDINCOM OR "WWW.LINKEDIN.COM"
S4   143852   (INTERNET? OR NETWORK? OR ONLINE OR COMPUTER OR WORLDWIDEWEB OR WORLD()WIDE()WEB)(5N)(MATCH? OR BROKER? OR NETWORKING? OR SEARCH?)
S5   0        S1:S2 AND S4
? show files 

File 2: INSPEC 1969-2005/Apr W3
    (c) 2005 Institution of Electrical Engineers
File 6: NTIS 1964-2005/Apr W3
    (c) 2005 NTIS, Intl Cpyrght All Rights Res
File 8: Ei Compendex(R) 1970-2005/Apr W3
    (c) 2005 Elsevier Eng. Info. Inc.
File 34: SciSearch(R) Cited Ref Sci 1990-2005/Apr W3
    (c) 2005 Inst for Sci Info
File 35: Dissertation Abs Online 1861-2005/Mar
    (c) 2005 ProQuest Info&Learning
File 62: SPIN(R) 1975-2005/Feb W1
    (c) 2005 American Institute of Physics
File 65: Inside Conferences 1993-2005/Apr W4
    (c) 2005 BLDSC all rts. reserv.
File 94: JICST-EPlus 1985-2005/Mar W2
    (c) 2005 Japan Science and Tech Corp(JST)
File 95: TEME-Technology & Management 1989-2005/Mar W3
    (c) 2005 FIZ TECHNIK
File 99: Wilson Appl. Sci & Tech Abs 1983-2005/Mar
    (c) 2005 The HW Wilson Co.
File 111: TGG Natl. Newspaper Index(SM) 1979-2005/Apr 26
    (c) 2005 The Gale Group
File 144: Pascal 1973-2005/Apr W3
    (c) 2005 INIST/CNRS
File 256: TecInfoSource 82-2005/Feb
    (c) 2005 Info. Sources Inc
File 434: SciSearch(R) Cited Ref Sci 1974-1989/Dec
    (c) 1998 Inst for Sci Info
File 475: Wall Street Journal Abs 1973-2005/Apr 26
    (c) 2005 The New York Times
File 583: Gale Group Globalbase(TM) 1986-2002/Dec 13
    (c) 2002 The Gale Group



Set  Items    Description
S1   38       AU=(WORK J? OR WORK, J?)
S2   4714     (JAMES OR JIM OR JIMMY)(2W)WORK
S3   91       WWW()LINKEDIN()COM OR WWWLINKEDINCOM OR "WWW.LINKEDIN.COM"
S4   2183933  (INTERNET? OR NETWORK? OR ONLINE OR COMPUTER OR WORLDWIDEWEB OR WORLD()WIDE()WEB)(5N)(MATCH? OR BROKER? OR NETWORKING? OR SEARCH?)
S5   107      S1:S2 AND S3:S4
S6   198      S5 OR S3
S7   74       S6 AND PY<2002
S8   53       RD (unique items)
? show files 

File 9: Business & Industry(R) Jul/1994-2005/Apr 26
    (c) 2005 The Gale Group
File 13: BAMP 2005/Apr W3
    (c) 2005 The Gale Group
File 15: ABI/Inform(R) 1971-2005/Apr 26
    (c) 2005 ProQuest Info&Learning
File 16: Gale Group PROMT(R) 1990-2005/Apr 26
    (c) 2005 The Gale Group
File 20: Dialog Global Reporter 1997-2005/Apr 27
    (c) 2005 The Dialog Corp.
File 47: Gale Group Magazine DB(TM) 1959-2005/Apr 27
    (c) 2005 The Gale group
File 75: TGG Management Contents(R) 86-2005/Apr W3
    (c) 2005 The Gale Group
File 88: Gale Group Business A.R.T.S. 1976-2005/Apr 26
    (c) 2005 The Gale Group
File 98: General Sci Abs/Full-Text 1984-2004/Dec
    (c) 2005 The HW Wilson Co.
File 141: Readers Guide 1983-2005/Dec
    (c) 2005 The HW Wilson Co
File 148: Gale Group Trade & Industry DB 1976-2005/Apr 27
    (c) 2005 The Gale Group
File 160: Gale Group PROMT(R) 1972-1989
    (c) 1999 The Gale Group
File 239: Mathsci 1940-2005/Jun
    (c) 2005 American Mathematical Society
File 267: Finance & Banking Newsletters 2005/Apr 26
    (c) 2005 The Dialog Corp.
File 268: Banking Info Source 1981-2005/Apr W3
    (c) 2005 ProQuest Info&Learning
File 275: Gale Group Computer DB(TM) 1983-2005/Apr 27
    (c) 2005 The Gale Group
File 369: New Scientist 1994-2005/Mar W4
    (c) 2005 Reed Business Information Ltd.
File 370: Science 1996-1999/Jul W3
    (c) 1999 AAAS
File 476: Financial Times Fulltext 1982-2005/Apr 27
    (c) 2005 Financial Times Ltd
File 484: Periodical Abs Plustext 1986-2005/Apr W3
    (c) 2005 ProQuest
File 553: Wilson Bus. Abs. FullText 1982-2004/Dec
    (c) 2005 The HW Wilson Co
File 610: Business Wire 1999-2005/Apr 26
    (c) 2005 Business Wire.
File 613: PR Newswire 1999-2005/Apr 26
    (c) 2005 PR Newswire Association Inc
File 621: Gale Group New Prod. Annou.(R) 1985-2005/Apr 27
    (c) 2005 The Gale Group
File 624: McGraw-Hill Publications 1985-2005/Apr 26
    (c) 2005 McGraw-Hill Co. Inc
File 625: American Banker Publications 1981-2005/Apr 27
    (c) 2005 American Banker
File 634: San Jose Mercury Jun 1985-2005/Apr 25
    (c) 2005 San Jose Mercury News
File 635: Business Dateline(R) 1985-2005/Apr 26
    (c) 2005 ProQuest Info&Learning
File 636: Gale Group Newsletter DB(TM) 1987-2005/Apr 27
    (c) 2005 The Gale Group
File 647: CMP Computer Fulltext 1988-2005/Apr W2
    (c) 2005 CMP Media, LLC
File 674: Computer News Fulltext 1989-2005/Apr W3
    (c) 2005 IDG Communications
File 696: DIALOG Telecom. Newsletters 1995-2005/Apr 26
    (c) 2005 The Dialog Corp.
File 810: Business Wire 1986-1999/Feb 28
    (c) 1999 Business Wire
File 813: PR Newswire 1987-1999/Apr 30
    (c) 1999 PR Newswire Association Inc



Set     Items   Description
S1         38   AU=(WORK J? OR WORK, J?)
S2       4714   (JAMES OR JIM OR JIMMY)(2W)WORK
S3         91   WWW()LINKEDIN()COM OR WWWLINKEDINCOM OR "WWW.LINKEDIN.COM"
S4    2183933   (INTERNET? OR NETWORK? OR ONLINE OR COMPUTER OR WORLDWIDEWEB
                OR WORLD()WIDE()WEB)(5N)(MATCH? OR BROKER? OR NETWORKING?
                OR SEARCH?)
S5        107   S1:S2 AND S3:S4
S6        198   S5 OR S3
S7         74   S6 AND PY<2002
S8         53   RD (unique items)
S9          0   S3 AND PY<2002
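
The way the later sets are built from the earlier ones can be sketched with ordinary set operations. This is only an illustration: the record IDs and publication years below are invented, and it assumes that a range such as S1:S2 stands for the union of the sets in the range. The RD (remove duplicates) step is not modelled.

```python
# Hypothetical record IDs mapped to publication years (not taken from the search).
pub_year = {1: 1999, 2: 2003, 3: 2001, 4: 2000, 5: 2004}

# Hypothetical hit lists standing in for the four base sets.
s1 = {1, 2}        # author search
s2 = {2, 3}        # name proximity search
s3 = {5}           # site-name search
s4 = {1, 3, 4, 5}  # topic proximity search

s5 = (s1 | s2) & (s3 | s4)                   # S1:S2 AND S3:S4
s6 = s5 | s3                                 # S5 OR S3
s7 = {r for r in s6 if pub_year[r] < 2002}   # S6 AND PY<2002
s9 = {r for r in s3 if pub_year[r] < 2002}   # S3 AND PY<2002

print(sorted(s5), sorted(s6), sorted(s7), sorted(s9))
```

With this toy data the last set is empty, which mirrors the zero-item result of the final date-limited set in the listing.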

ABSTRACT



A method of querying a user profile commences with a first 
access to a public portion of a knowledge profile for each of 
a plurality of potential targets of the electronic document, 
the public portion of each knowledge profile including 
public knowledge terms indicative of a knowledge base of a 
potential target of the electronic document. The first access 
is responsive to a first query received from an originator. A 
first matching operation is performed between a document 
term within the electronic document and public knowledge 
terms within the public portion of each knowledge profile to 
identify a first set of targets for which a match exists between 
the document term and at least one public knowledge term. 
The first set of targets is published to the originator. Respon- 
sive to a second query from the originator, the private 
portion of a knowledge profile for each of the plurality of 
potential targets of the electronic document is accessed, the 
private portion of each knowledge profile including private 
knowledge terms indicative of a knowledge base of a 
potential target of the electronic document. A second match- 
ing operation between the document term within the elec- 
tronic document and the private knowledge terms within the 
private portion of each knowledge profile is performed to 
identify a second set of targets for which a match exists 
between the document term and at least one private knowl- 
edge term. Each target of the second set of targets is then 
prompted for authorization to be published to the originator. 
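
The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `KnowledgeProfile` class, the function names, exact-string matching of terms, and the `authorize` callback are all hypothetical stand-ins.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class KnowledgeProfile:
    """Hypothetical profile: a target plus public and private knowledge terms."""
    target: str
    public_terms: set[str] = field(default_factory=set)
    private_terms: set[str] = field(default_factory=set)


def match_public(term: str, profiles: list[KnowledgeProfile]) -> list[str]:
    """First matching operation: targets whose public terms contain the term."""
    return [p.target for p in profiles if term in p.public_terms]


def match_private(term: str, profiles: list[KnowledgeProfile]) -> list[str]:
    """Second matching operation: targets whose private terms contain the term."""
    return [p.target for p in profiles if term in p.private_terms]


def publish_authorized(second_set: list[str],
                       authorize: Callable[[str], bool]) -> list[str]:
    """Prompt each private match; keep only targets that grant authorization."""
    return [t for t in second_set if authorize(t)]


profiles = [
    KnowledgeProfile("alice", {"cryptography"}, {"patents"}),
    KnowledgeProfile("bob", {"patents"}, {"cryptography"}),
    KnowledgeProfile("carol", set(), {"cryptography"}),
]

first_set = match_public("cryptography", profiles)     # published immediately
second_set = match_private("cryptography", profiles)   # held back pending consent
published = publish_authorized(second_set, lambda t: t == "bob")
```

In this sketch a target that declines is simply omitted from `published`, so the originator learns nothing about private matches that were not authorized.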
4. Only when the match has a matching metric within a
predetermined top number of matches; or

5. E-mail "knowledge sweep" queries (i.e., an e-mail is
sent to the target when the target is not already recorded
as a potential target and a match occurs).

At step 722, each of the targets within the reduced second
set of targets is notified of the "knowledge sweep" expanded
query. For example, a target may receive an "alert" and
optionally an e-mail notification. In an alternative
embodiment, a client pop-up notification may be implemented
in the event that the target is operating a client
application to the knowledge access server 26. The e-mail
notification includes (1) a URL that may link to an "alert"
page generated by the Web 720 and (2) an expiration interval
(e.g., the expiration interval specified at step 714 by the
originator) for which a response from the target should be
received. The target may also be presented with the original
text of the query (e.g., search terms or selected terms from
an electronic document such as an e-mail) and the name (and
further details, such as profile details) of the originator so
that the originator does not remain anonymous.

At step 724, the targets may exercise any one of a number
of response options. For example, the target may return his
or her identity to the originator, send an e-mail to the
originator, refuse the query, suppress the query terms or the
originator, or refer the query to a further target (e.g., a person
or an e-mail address). The refusal of the query comprises a
"do nothing" response, and the knowledge site management
server 27 does not return a negative match. In this way, the
originator is not advised of the identity of a target located by
a "knowledge sweep" expanded query if the relevant target
refuses the query. In the case where the query is referred to
a further target, the actual referral is not propagated or
communicated to the further target that communicated to the
originator. The reference, as viewed by the originator, will
identify both the further target and the original target that
performed the referral. In this way, referring cannot be
performed anonymously.

At step 726, the progressive return of target identifiers to
the originator, responsive to the "knowledge sweep" query,
is displayed to the originator within, for example, a Web
page dynamically generated for the originator by the Web
server 20 responsive to input from the knowledge site
management server 27.

At step 728, the knowledge site management server 27
pools the referrals to a further target received from multiple
targets in the second set targets. Accordingly, only a single
identifier for the further target is displayed to the originator,
but will indicate each of the targets of the second set of
targets that referred to the further target.

At decision box 730, a determination is made as to
whether the "knowledge sweep" expanded query has
expired. If not, the method 700 loops back to step 724. This
determination is made with reference to the expiration
interval specified by the originator at step 714. Alternatively,
should the query have expired, the query is removed from an
alert area of a Web page generated for each target of the
second set, and the query is archived at step 732. The method
700 terminates at step 734.

Thus, a method and apparatus for querying a profile over
a network have been described. Although the present invention
has been described with reference to specific exemplary
embodiments, it will be evident that various modifications
and changes may be made to these embodiments without
departing from the broader spirit and scope of the invention.
Accordingly, the specification and drawings are to be
regarded in an illustrative rather than a restrictive sense.

What is claimed is:

1. A method of querying a knowledge profile, the method
including:

responsive to a query, including a query term, from an
originator, accessing a public portion of a knowledge
profile for each of a plurality of potential targets, the
public portion of each knowledge profile including
public knowledge terms indicative of a knowledge base
of a potential target;

performing a first matching operation between the query
term and the public knowledge terms within the public
portion of each knowledge profile to identify a first set
of targets for which a match exists between the query
term and at least one public knowledge term;

publishing the first set of targets to the originator;

responsive to the query from the originator, accessing a
private portion of a knowledge profile for each of the
plurality of potential targets of the electronic document,
the private portion of each knowledge profile including
private knowledge terms indicative of a knowledge
base of a potential target;

performing a second matching operation between the
query term and the private knowledge terms within the
private portion of each knowledge profile to identify a
second set of targets for which a match exists between
the query term and at least one private knowledge term;
and

prompting each target of the second set of targets for
authorization to be published to the originator.

2. The method of claim 1 including publishing a specific
target of the second set of targets to the originator in
response to an authorization grant from the specific target.

3. The method of claim 2 including progressively publishing
respective targets of the second set to the originator
in response to progressive authorizations received from
targets of the second set of targets.

4. The method of claim 1 wherein each public knowledge
term is identified by a confidence level above a predetermined
minimum threshold.

5. The method of claim 1 wherein each private knowledge
term is identified by a confidence level below a predetermined
minimum threshold.

6. The method of claim 1 wherein each public knowledge
term is identified by a user-specified public designation.

7. The method of claim 1 wherein each private knowledge
term is identified by a user-specified private designation.

8. The method of claim 1 including removing a specific
target from the second set of targets based on a query option
specified by the originator.

9. The method of claim 8 wherein the second query option
specifies a minimum confidence level for a match between
the second query and a knowledge profile.

10. The method of claim 8 wherein the second query
option specifies a maximum number of targets that may be
included within the second set of targets.

11. The method of claim 8 wherein the query option
specifies a minimum confidence level for a match between
the query term and the private knowledge terms.

12. The method of claim 1 including withholding publication
of a specific target of the second set of targets to the
originator in response to an authorization denial from the
specific target of the second set of targets.

13. The method of claim 1 including prompting each
target of the second set of targets to refer the query to a
further target.

14. The method of claim 13 including pooling referrals of
the further target received from a plurality of targets of the
second set of targets.

15. The method of claim 1 including specifying a query
expiration interval with respect to the query after which the 
query is retired. 

16. A computer readable medium storing a sequence of 
instructions that, when executed by a computer system, 
cause the computer system to: 

responsive to a query, including a query term, from an
originator, access a public portion of a knowledge 
profile for each of a plurality of potential targets, the 
public portion of each knowledge profile including 
public knowledge terms indicative of a knowledge base 
of a potential target; 

perform a first matching operation between the query term 
and public knowledge terms within the public knowl- 
edge portion of each knowledge profile to identify a 
first set of targets for which a match exists between the 
query term and at least one public knowledge term; 

publish the first set of targets to the originator; 

responsive to the query of the originator, access a private 
portion of a knowledge profile for each of the plurality 
of potential targets of the electronic document, the 
private portion of each knowledge profile including 
private knowledge terms indicative of a knowledge 
base of a potential target; 

perform a second matching operation between the query
term and the private knowledge terms within the private
portion of each knowledge profile to identify a
second set of targets for which a match exists between
the query term and at least one private knowledge
term; and

prompt each target of the second set of targets for autho- 
rization to be published to the originator. 

17. A system to query a knowledge profile, the system 
including: 

a request handler to receive a query, including a query 
term, from an originator; 

a comparator, responsive to the query, to access both
public and private portions of a knowledge profile for 
each of a plurality of potential targets, the public and 
private portions of each knowledge profile including 
respective public and private knowledge terms indica- 
tive of a knowledge base of a potential target, the 
comparator further to perform a first matching operation
between the query term and public knowledge terms within
the public knowledge portion of each knowledge profile to
identify a first set of targets for which a match exists
between the query term and at least one public knowledge
term, and to perform a second matching operation
between the query term and the private knowledge
terms within the private portion of each knowledge
profile to identify a second set of targets for which a
match exists between a query term and at least one
private knowledge term; and

a notifier to publish the first set of targets to the originator 
and to prompt each target of the second set of targets for 
authorization to be published to the originator. 

18. The system of claim 17 wherein the notifier publishes a
specific target of the second set of targets to the originator 
in response to an authorization grant from the specific target. 

19. The system of claim 17 wherein the notifier is to 
progressively publish respective targets of the second set of 
targets to the originator in response to progressive authori- 
zations received from the respective targets of the second set 
of targets. 

20. The system of claim 17 wherein the comparator is to 
remove a specific target from the second set of targets based 
on a query option specified by the originator. 

21. The system of claim 20 wherein the query option
specifies a minimum confidence level for a private knowledge
term for a match with respect to the query term by the
comparator.

22. The system of claim 17 wherein the notifier is to 
withhold publication of a specific target of the second set of 
targets to the originator in response to an authorization 
denial from the specific target of the second set of targets. 

23. The system of claim 17 wherein the notifier is to 
prompt each target of the second set of targets to refer the 
query to a further target. 

24. The system of claim 23 wherein the notifier is to pool 
referrals of the further target received from a plurality of 
targets of the second set of targets. 

25. The system of claim 17 wherein the comparator is to 
retire the query after the expiration of a specified query 
interval. 

26. A system to query a knowledge profile, the system 
including: 

first means for receiving a query, including a query term, 
from an originator; 

second means, responsive to the query, for accessing
both public and private portions of a knowledge profile 
for each of a plurality of potential targets, the public 
and private portions of each knowledge profile includ- 
ing respective public and private knowledge terms 
indicative of a knowledge base of a potential target, the 
second means further for performing a first matching
operation between the query term and public knowledge
terms within the public knowledge portion of each
knowledge profile to identify a first set of targets for
which a match exists between the query term and at least
one public knowledge term, and for performing a second
matching operation between the query term and the private
knowledge terms within the private portion of each
knowledge profile to identify a second set of targets for
which a match exists between a query term and at least
one private knowledge term; and

third means for publishing the first set of targets to the
originator and for prompting each target of the second set
of targets for authorization to be published to the
originator.
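
Several of the dependent claims (confidence thresholds in claims 4 and 5, referral pooling in claims 14 and 24, and query expiration in claims 15 and 25) lend themselves to a short illustration. The sketch below is hypothetical: the function names, the numeric confidence scale, and the use of plain timestamps are assumptions made for the example and do not come from the specification.

```python
from __future__ import annotations

from collections import defaultdict


def split_terms(term_confidence: dict[str, float],
                threshold: float) -> tuple[set[str], set[str]]:
    """Terms above the threshold are treated as public, terms below as private.

    A term exactly at the threshold is left unclassified here, because the
    claims only speak of levels "above" and "below" the threshold.
    """
    public = {t for t, c in term_confidence.items() if c > threshold}
    private = {t for t, c in term_confidence.items() if c < threshold}
    return public, private


def pool_referrals(referrals: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Collapse (referrer, further_target) pairs into one entry per further
    target, listing every referrer once."""
    pooled: dict[str, list[str]] = defaultdict(list)
    for referrer, further_target in referrals:
        if referrer not in pooled[further_target]:
            pooled[further_target].append(referrer)
    return dict(pooled)


def is_expired(issued_at: float, interval: float, now: float) -> bool:
    """A query is retired once the originator's expiration interval has elapsed."""
    return now - issued_at >= interval


public, private = split_terms({"tcp": 0.9, "smtp": 0.4, "dns": 0.6}, 0.6)
pooled = pool_referrals([("bob", "dave"), ("carol", "dave"), ("bob", "erin")])
expired = is_expired(issued_at=0.0, interval=3600.0, now=3600.0)
```

Pooling keeps a single entry for each further target while still recording who referred the query, which matches the requirement that referrals are not anonymous.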

ABSTRACT



The present invention provides systems and methods for 
secure transaction management and electronic rights protec- 
tion. Electronic appliances such as computers equipped in 
accordance with the present invention help to ensure that 
information is accessed and used only in authorized ways, 
and maintain the integrity, availability, and/or confidentiality 
of the information. Such electronic appliances provide a 
distributed virtual distribution environment (VDE) that may 
enforce a secure chain of handling and control, for example, 
to control and/or meter or otherwise monitor use of elec- 
tronically stored or disseminated information. Such a virtual 
distribution environment may be used to protect rights of 
various participants in electronic commerce and other elec- 
tronic or electronic-facilitated transactions. Distributed and 
other operating systems, environments and architectures, 
such as, for example, those using tamper-resistant hardware- 
based processors, may establish security at each node. These 
techniques may be used to support an all-electronic infor- 
mation distribution, for example, utilizing the "electronic 
highway." 